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We study solutions to nonlinear stochastic differential systems driven by a multi- 
dimensional Wiener process. A useful algorithm for strongly simulating such stochas- 
tic systems is the Castell-Gaines method, which is based on the exponential Lie 
series. When the diffusion vector fields commute, it has been proved that at low 
orders this method is more accurate in the mean-square error than corresponding 
stochastic Taylor methods. However it has also been shown that when the diffusion 
vector fields do not commute, this is not true for strong order one methods. Here 
we prove that when there is no drift, and the diffusion vector fields do not com- 
mute, the exponential Lie series is usurped by the sinh-log series. In other words, 
the mean-square error associated with a numerical method based on the sinh-log 
series, is always smaller than the corresponding stochastic Taylor error, in fact to 
all orders. Our proof utilizes the underlying Hopf algebra structure of these series, 
and a two-alphabet associative algebra of shuffle and concatenation operations. We 
illustrate the benefits of the proposed series in numerical studies. 
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1. Introduction 

We are interested in designing strong series solutions of nonlinear Stratonovich 
stochastic differential systems of the form 

yt = yo + y2 ^»(y-)dT^;- 

Here (W^, . . . , W^) is a d-dimensional Wiener process and yt G for some N £ 
N and aU t e We suppose that V,: M^, i = l,...,d, are smooth 

non-commuting vector fields which in coordinates are Vi — X^jLi ^i^vj- Repeated 
iteration of the chain rule reveals the stochastic Taylor expansion for the solution 
to the stochastic differential system. Indeed for any smooth function /: M 
we have the formal stochastic Taylor series expansion 

f°yt = foyo+ ^ Jw{t)Vw o f oyQ. 

Here A+ is the collection of non-empty words over the alphabet A = {1, . . . , rf}. We 
adopt the standard notation for Stratonovich integrals, if w = oi . . . a„ then 

JQ Jo 
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We have also written the composition of the vector fields as = V^^ o o • • • o V^„ • 
We define the flow-map ipt as the map such that 

Vt o f o yo = f o yt. 

It has a formal stochastic Taylor expansion of the form 

<Pt = id+ ^ Jw{t) V^. 
weA+ 

Note that ft^fs = <^i+s for all non-negative s and cpo = id, the identity mapping. 

Classical strong numerical methods are based on truncating the stochastic Tay- 
lor expansion for the flow-map and applying the resulting approximate flow-map 
(pt over successive small subintervals of the global interval of integration required 
(see Kloeden and Platen 1999 or Milstein 1994). An important and expensive in- 
gredient in all numerical methods is the strong simulation/approximation of the 
required retained multiple integrals Jw{t), on each integration step. Hero wo will 
take their suitable approximation as granted (see, for example, Wiktorsson 2001 for 
their practical simulation). Now let F: Diff(R^) Diff(M^) be a smooth function. 
We can also construct flow approximations from ipt via the following procedure. 

• Construct the new series ipt = F{<Pt)- 

• Truncate this series to produce the finite expansion i/jt- 

• Reconstruct an approximate flow-map as (fit = F~^{'4't)- 

• The "flow error" is the flow remainder Rt = (pt — ipt- 

• An approximate solution is given by yt = 'Pt ° Ho- 



• The mean-square error in this approximation is \\Rtoyo 




For the special case F — log, i.e. the logarithm function, this procedure was outlined 
by Castell and Gaines (1996). The resulting series Vt = logi^t is the exponential 
Lie series, which lies in R(Vi, . . . , Vd), the non-commutative algebra of formal series 
generated by the vector fields Vi, . . . ,Vd. Indeed any truncation i/^t, with multi- 
ple integrals replaced by suitable approximations, also lies in K(T4, • • • , Vd) and is 
therefore a vector field. Hence (pt = exp ijjt and an approximation yt to the solution 
can be constructed by solving the ordinary differential system for u = u{t): 

u' = 'iptou 

for r € [0,1] with u{0) = yo- The solution to this system at time r = 1, itself 
approximated by an ordinary differential numerical method, is u{l) ~ yt- 

So far, what has been proved for the Castell- Gaines method? Castell and Gaines 
(1995, 1996) prove that the strong order one-half method constructed in this way is 
always more accurate than the Euler-Maruyama method. Indeed they prove that 
this method is asymptotically efficient in the sense of Newton (1991). Further in 
the case of a single driving Wiener process {d = 1), they prove the same is true 
for the strong order one Castell-Gaines method. By asymptotically efficient we 
mean, quoting from Newton (1991), that they "minimize the leading coefficient 
in the expansion of mean-square errors as power series in the sample step size". 
Lord, Malham and Wiese (2008) and Malham and Wiese (2008) proved that when 
the diffusion vector fields commute, but not necessarily with the drift vector field, 
then the strong order one and also three-halves Castell-Gaines methods have a 
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mean-square error that is smaller than the mean-square error for the corresponding 
stochastic Taylor method. However Lord, Malham and Wiese (2008) also prove 
that for linear diffusion vector fields which do not commute, the strong order one 
Castell-Gaines method does not necessarily have a smaller mean-square error than 
the corresponding stochastic Taylor method. Indeed there are regions in the phase 
space where the local error of the stochastic Taylor method is smaller. 

Hence we are left with the following natural question when the diffusion vector 
fields do not commute. Is there a solution series ansatz ipt = F((pt) for some function 
F, for which the mean-square error of the numerical method so constructed, is 
always smaller than the corresponding stochastic Taylor method? In this paper we 
answer this question, indeed under the assumption there is no drift, we: 

(1) Prove the mean-square error of an approximate solution constructed from the 
sinh-log expansion (with F = sinhlog above) is smaller than that for the stochastic 
Taylor expansion, to all orders. 

(2) Prove that a numerical method based on the sinh-log expansion has a global 
error that is smaller than that for the corresponding stochastic Taylor method. 

(3) Utilize the Hopf shuffle algebra of words underlying such expansions; in fact 
we retract to a new associative algebra of concatenation and shuffle operators, that 
acts on the Hopf shuffle algebra of words. 

(4) Underpin our theoretical results with concrete numerical simulations. 

We examine and interpret these statements in detail next, in Section 2, where we 
answer the following immediate questions. First, what precisely, is the sinh-log ap- 
proximation and the properties we prove for it? Second, how do we prove the result; 
what is the connection with Hopf shuffle algebras and the concatenation-shuffle op- 
erator algebra mentioned? In Section 3 we provide the technical specification of 
the concatenation-shuffle operator algebra and prove some polynomial identities 
important for our main result. In Section 4 we present our main result. Then in 
Section 5 we discuss the global error result above, and perform numerical simula- 
tions confirming our results. We give some concluding remarks in Section 6. 

2. Principal ideas 

The goal of this section is to motivate and make precise statements about the 
sinh-log approximation we propose. 

(a) Stochastic series expansion approximations 

We begin by outlining the approximation procedure presented in the introduc- 
tion in more detail. Suppose the smooth function F has a real series expansion of 
the form 

oo 

F(a;)=5^Cfe(x-l)^ 

fe=i 

with some finite radius of convergence about x = 1, and for some coefficient set 
{Ck- k ^ 1} with Ci = 1. Given the flow-map (pt, we construct the series 

oo 

^t = F(<^t) = ^Cfc(<^t-id)^ 

k=l 
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We substitute, into this series expansion, the stochastic Taylor series for the flow- 
map ipt- After rearrangement, we get 

weA+ 

where 

\w\ 

We truncate the series, dropping all terms with words w of length \w\ ^ n + 1. 
This generates the approximation ipt, once we have replaced all retained multiple 
integrals Ju{t) by suitable approximations. Then, in principle, we construct the 
solution approximation yt from yt = <ft ° Vo, where 

^t=F-\iPt). 

Performing this reconstruction is nontrivial in general (see Section 5). 

For example, to construct the exponential Lie series approximation of Castell 
and Gaines (1995), we take F = log and construct the series 



ipt = log =YCk {ft - id)* 



fe=i 

where Ck = for k ^ 1. Substituting the stochastic Taylor series for (pt, 

the series expansion for ijjt above becomes the exponential Lie series (see Strichartz 
1987, Ben Arous 1989, Castell 1993 or Baudoin 2004 for the full series) 

wek+ 

where for w = ai . . . o„ we have V[^] = [K^ , [K^ , • • • , [K„_i , K„] • • •] and 

(-l)'-(-) 

Here is the group of permutations of the index set {1, . . . , |w|}, e{a) is the 
cardinality of the set e {1, . . . , |w| — 1}: > a{j + 1)}, and D^^^l^'^ is the 
combinatorial number: \w\ — 1 choose e{a). Truncating this series and using suit- 
able approximations for the retained J^^-Iq^, produces V't. We then reconstruct 
the solution approximately using (pt = exp-ipf The actual solution approximation 
yt = 'Pt°yo is then computed by solving the ordinary differential equation generated 
by the vector field V't- 

To construct the sinh-log approximation, we take F = sinh log so that 

oo 

ipt = sinh log ipt = ^{ipt- ipt^) = YCk {<Pt - id)*', 

fe=i 
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where Ci = 1 and Cfc = |(— for A; ^ 2. Again substituting the stochastic 

Taylor series for cpt we get the series expansion for ipt shown above with terms 
Kw{t)Vw where the coefficients Kyj{t) now exphcitly involve the sinh-log coefficients 
Cfc. Then, in principle, we can reconstruct the solution approximately using 



tpt = expsinh ^{ipt) = V't + y id + tpf. 

Remark 2.1. Suppose the vector fields Vl, i = 1, . . . ,d are sufficiently smooth and t 
sufficiently small (but finite). Then the approximation (ptoyo constructed using the 
sinh-log expansion, as just described, is square-integrable. Further if y is the exact 
solution of the stochastic differential equation, and ipt includes all terms KwVw 
involving words of length w ^ n, then there exists a constant C(n, such that 
WVt — 'Pt ° VoWl^ ^ Cin, |y|) t'""*"^'^^; here | • | is the Euclidean norm. This follows 
by arguments exactly analogous to those for the exponential Lie series given in 
Malham & Wicse (2008; Theorem 7.1 and Appendix A). 

Remark 2.2. Naturally, the exponential Lie series ijjt and and its truncation i/jt lie 
in the Lie algebra of vector fields generated by Fi, . . . , V^. Hence exp(r^t) is simply 
the ordinary flow-map associated with the autonomous vector field ipt ■ 

Remark 2.3. The exponential Lie series originates with Magnus (1954) and Chen 
(1957); see Iserles (2002). In stochastic approximation it appears in Kunita (1980), 
Flciss (1981), Azcncott (1982), Strichartz (1987), Ben Arous (1989), Castcll (1993), 
Castell & Gaines (1995,1996), Lyons (1998), Burrage & Burrage (1999), Baudoin 
(2004), Lord, Malham & Wiese (2009) and Malham & Wiese (2008). 

(6) Hopf algebra of words 

Examining the coefficients in the series expansion for (/; above, we see that 
they involve linear combinations of products of multiple Stratonovich integrals (we 
suspend explicit t dependence momentarily). The question is, can we determine 
explicitly? Our goal here is to reduce this problem to a pure combinatorial one. 
This involves the Hopf algebra of words (see Reutenauer 1993). 

Let K be a commutative ring with unit. In our applications we take IK = M 
or K = J, the ring generated by multiple Stratonovich integrals and the con- 
stant random variable 1, with pointwise multiplication and addition. Let K(A) 
denote the set of all noncommutative polynomials and formal series on the alphabet 
A = {1,2, ... ,d} over K. With the concatenation product, IK(A) is the associative 
concatenation algebra. For any two words u, u G K(A) with lengths \u\ and \v\, 
we define the shuffle product uluv to be the sum of the words of length |w| -I- \v\ 
created by shuffling all the letters in u and v whilst preserving their original order. 
The shuffle product is extended to K{A) by bilinearity. It is associative, and dis- 
tributive with respect to addition, and we obtain on K(A) a commutative algebra 
called the shuffle algebra (note 1 ww = wluI = w for any word w where 1 is the 
empty word). The linear signed reversal mapping a £ End(K(A)): 

aow = (— l)"a„ . . . oi, 

for any word w = ai . . . a„ is the antipode on K(A). There are two Hopf algebra 
structures on K{A), namely (K(A), c, 6, rj, s, a) and {K{A),s, 6', rj, e, a), where r] and 
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e are unit and co-unit elements, and S and S' respective co-products (Reutenauer, 
p. 27). We define the associative algebra using the complete tensor product 

n = ]K{A)^K(A> 

with the shuffle product on the left and the concatenation product on the right 
(Reuntenauer, p. 29). The product of elements u(E)x,v^yGTl is given by 

(u ^ x){v (g) y) = (uLuv) (g) (xy) , 

and formally extended to infinite linear combinations in 7i via linearity. As a tensor 
product of two Hopf algebra structures, H itself acquires a Hopf algebra structure. 



(c) Fullback to Hopf shuffle algebra 

Our goal is to puUback the flow- map (p and also ip toH (with K = M). Let V be 
the set of all vector fields on M^; it is an M-module over C°° (R^) (see Varadarajan 
1984, p. 6). Wc know that for the stochastic Taylor series the flow-map (p G J(V) 
(with vector field composition as product). Since JJ(V) = 0„>o J ® Vn, where V„ 
is the subset of V of compositions of vector fields of length n, we can write 

(y9 = l®idv+ ^ JtufSiVw. 

The Unear word-to-vector field map k: M(A) — > V given by k: w i-^ K,' iw a con- 
catenation homomorphism, i.e. k{uv) = k{u)k{v) for any u,v £ A+. And the linear 
word-to-integral map /x: IR(A) J given by /x : w i-^ is a shuffle homomorphism, 
i.e. Ijl{uluv) = /j,{u)fj,{v) for any u,v € A+ (see for example, Lyons, et. al. 2007, 
p. 35 or Reutenauer 1993, p. 56). Hence the map ii® n: Tt ^ ®n>o^ ® V„ is a 
Hopf algebra homomorphism. The pullback of the flow-map by /U ® «; is 

{jJL ® k)* (fi = 1 ® 1 -\- W®W. 

All the relevant information about the stochastic flow is encoded in this formal 
series; it is essentially Lyons' signature (see Lyons, Caruana and Levy 2007; Baudoin 
2004). Hence by direct computation, formally we have 



{lIlSiK)*1p= 2_^Cfe((/X(g)K)>-l«)l) 

fc^l \Mi,...,Mfc£A+ 



. . LUUfc) (g) (Ul . . . Uk) 



weA' \k=i ui....,ukeA+ 

tO=«i...U(j 

= {K ow) (g) w, 
weA" 



Ml LU . . . LU Ufc (8> W 
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where ii" o w is defined by 

\w\ 

Kow = '^^Ck Ui LU . . . m Ufe, 

k—1 Ul,...,Uk^^'^ 
W=Ui...Uk 

corresponds to Ku, (indeed it is the puUback ^i*Kw to see Section 3). Having 
reduced the problem of determining K o w to the algebra of shuffles, a further 
simplifying reduction is now possible. 

Remark 2.4. The use of Hopf shuffle algebras in stochastic expansions can be traced 
through, for example, Strichartz (1987), Reutenauer (1993), Gaines (1994), Li & 
Liu (2000), Kawski (2001), Baudoin (2004), Murua (2005), Ebrahimi-Fard & Guo 
(2006), Manchon & Paycha (2006) and Lyons, Caruana & Levy (2007), to name 
a few. The paper by Munthe-Kaas & Wright (2008) on the Hopf algebraic of Lie 
group integrators actually instigated the Hopf algebra direction adopted here. A 
useful outline on the use of Hopf algebras in numerical analysis can be found therein, 
as well as the connection to the work by Connes & Miscovici (1998) and Connes & 
Kreimer (1998) in renormalization in perturbative quantum field theory. 



{d) Retraction to concatenations and shuffles 

For any given word w = ai . . . a„+i we now focus on the coefficients K o w. We 
observe that it is the concatenation and shuffle operations encoded in the structural 
form of the sum for K ow that carry all the relevant information. Indeed each term 
7ii Lu ... Lu is a partition of w into subwords that are shuffled together. Each 
subword ui is a concatenation of \ui\ letters, and so we can reconstruct each term 
of the form u\ua . . . luUk from the following sequence applied to the word w: 

cl«i|-l5cl«2|-ls...Scl"'=|-^ 

where the power of the letter c indicates the number of letters concatenated to- 
gether in each subword Ui and the letter s denotes the shuffle product between the 
subwords. In other words, if we factor out the word w, we can replace K ow hy & 
polynomial K of the letters c and s. In fact, in Lemma 3.3 we show that 

n 

i^ = ^Cfc+i(c"-'=ms'=). 

Thus we are left with the task of simplifying this polynomial in two variables (lying 
in the real associative algebra of concatenation and shuffle operations). 

Remark 2.5. We devote Section 3 to the rigorous justification of this retraction, the 
result above, and those just following. A key ingredient is to identify the correct 
action of this algebra over (1K(A), s,a). 

Remark 2.6. There is a natural right action by the symmetric group §„ on K(A)^, 
the subspace of K(A) spanned by words of length n (Reutenauer 1993, Chapter 8). 
This action is transitive and extends by linearity to a right action of the group alge- 
bra K(Sri) on IK(A)jj. We are primarily concerned with shufHes and multi-shuffles, a 
subclass of operations in IK(S„), and in particular, we want a convenient structure 
that enables us to combine single shuffles to produce multi-shuffles. 
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(e) Stochastic sinh-log series coefficients 

The coefficient set {Ck- > 1} determines the form of the function F. Our 

ultimate goal is to show order by order that the stochastic sinh-log expansion guar- 
antees superior accuracy. Hence order by order we allow a more general coefficient 
set {Ck : k ^ 1}, and show that the sinh-log choice provides the guarantee we seek. 

Definition 2.1 (Partial sinh-log coefficient set). Define the partial sinh-log coeffi- 
cient sequence: 

'l, k = l, 

i(-l)" + e, fc = n+l, 
where e G M. 

With the choice of coefficients {Ck}, we see that 

n 
fe=0 

This has an even simpler form. 

Lemma 2.1. With the partial sinh-log coefficient sequence {Ck}, the coefficient K 
is given by 

jr= i(c"-a„) +es", 
where an is the antipode for words of length n-\-l. 

Proof. We think of (c — ,s)", with expansion by concatenation, as the generator for 
the polynomial (in K(]B)^; see Section 3) defined by 



(c-s)" ==^(-l)'=(c"-'=ms'= 



Then by Lemma 3.2 in Section 3 we have the following identity 

(c - s)" = -a„. 

Hence using the sinh-log coefficients and splitting the ffist term, we have 



fc=0 



c" + i(c-s)"-Hes" 



2 '2 

l„n 1. 
2^ 



□ 



Corollary 2.2. For any word w = ai . . . a„+i we have 

K ow = ^(w — aow) -l-eaiuj ... lu a„+i , 

and thus 

n+l 
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Remark 2.7. For the stochastic sinh-log expansion the coefficients K^i, thus have an 
extremely simple form. There are several strategies to prove this form. The result 
can be proved directly in terms of multiple Stratonovich integrals by judicious use 
of their properties, the partial integration formula and induction — the proof is long 
but straightforward. That this strategy works is also revealed by the strategy we 
have adopted in this paper, which we believe is shorter and more insightful. 

3. Concatenation-shuffle operator algebra 

(a) Algebra and action 

With B = {c, s}, let IK(B) denote the set of all noncommutative polynomials 
and formal scries on B over IK. We can endow K(B) with the concatenation product 
or shuffle product, and also generate an associative concatenation- shuffle operator 
algebra on K(B) as follows. 

Definition 3.1 (Shuffle gluing product). The map g: K(B) O K(B) K(B), the 
associative and bilinear shuffle gluing product, is defined by 

g : bi iSi b2 <->■ 

i.e. we concatenate the element bi with s and the element 62 in K(B) as shown. 

Endowed with the shuffle gluing product, K(B) is an associative algebra with 
unit element s~^ (see Reutenauer 1993, p. 26, for the definition of s~^). We define 
the graded associative tensor algebra /C by 

/C = 0]K(B)„®K(A)„^,, 

n^O 

with the shuffle gluing product on the left in K(B)^ and concatenation product on 
the right in K(A)^_|_^ — here IK(B)^ and K.(A)^_|_^ denote the subspaces of K(B) and 
K(A), respectively, spanned by words of length n and n + 1, respectively. Thus if 
61 (8> ui and 62 <8> U2 are in K, then their product is 

(61 (g) ui){b2 (8) U2) = (61S62) <8> (W1U2), 

with extension to K. by bilinearity. 

We now define the homomorphism C: /C ^ (IK(A), s, S' , a) as follows. Any word 
b G B+, for some k gN and ni, . . . , € N U {0}, can be expressed in the form 

b = c^^sc"'^sc"'^ ...sc'"'. 

There are (k—l) occurrences of the symbol 's' in b, and ni+n2-\ hn^+fc — 1 = |6|. 

Here c" represents the word consisting of c multiplied by concatenation n times, 
= 1; similarly for s" and s°. Then we define 

b®W^boui = LU U„2 m . . . m , 

where w = w„jW„2 • • • the successive subwords ii„j, 7i„2,. . . have re- 

spective lengths ni + 1, n2 + 1,. . . , + 1. Note the sum of the lengths of the 
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subwords is n + 1. The map ^ extends by linearity to K.. The c-symbol indicates 

a concatenation product and the s-symbol a shuffle product in the appropriate n 
slots between the n + 1 letters in any word w on A+ of length n + 1. That C is a 
homomorphism from IC to K(A) follows from: 

C((&1 O Ui){b2 O U2)) = C((&lS&2) ® (U1W2)) = Cih ® Ul) LuC(62 M2)- 

Definition 3.2 (Partition orbit). We define the (shujfle) partition orbit, 0(w), of 
a word w G A+ to be the subset o/K(A) whose elements are linear combinations of 
words constructed by concatenating and shuffling the letters of w = a\ . . . 

©(■u;) = {span(Mi Lu . . . Lu Wfc) : wi . . . Ufc = w; Ui, . . . , Ufe e A"*"; fc £ {1, . . . , 

For any u G 0(w) there exists a 6 G IK(B)|^|_j^ such that u = C,{b w). Hence 
we can consider the preimage of 0(«;) under C, m. K, given by C,~^G>{w) = {b®w € 
K: C,{b®w) G <Q{w)}. Thus any element in 0{w) can be identified with an element 
b®w €l C,~^0{w) for a unique b G K(B)|^|_j and there is a natural projection map 

7r:OH^]K(B)|^l_,. 

(6) Polynomial identities 

Here wc prove a sequence of lemmas that combine to prove omv main results. 
The aim of the first two lemmas is to establish a form for the antipode a as a 
polynomial in the concatenation-shuffle operator algebra (]R(B), gi). We shall denote 
the antipode in End(M(A)^^^) by a„; it sign reverses any word w G M(A)^^^. 

Lemma 3.1 (Partial integration formula). The partial integration formula applied 
repeatedly to the multiple Stratonovich integral Jyj, where w = ai...a„+i, pulled 
back to M(B)|^|_-^, is given by 

a„ = -c" - ^ c''san-k-i- 
fe=o 

Proof. Repeated partial integration on the multiple Stratonovich integral Jw with 
w — ai . . . a„+i, pulled back to M(A)^_|_j via the word-to- integral map fx generates 
the identity: 

ai . . .a„+i = (ai . . .a„) ijja„+i - (ai . . . a„_i) m (o„+ia„) H h (-l)"a„+i . . . oi. 

After rearrangement, the projection of this identity in 0{w) onto K(B)^ via tt, using 
the definition for a„, generates the identity shown. □ 

Lemma 3.2 (Antipode polynomial). The antipode a„ G End(R(A)^_j_^) and poly- 
nomial — (c — s)" G IK(B)^ are the same linear endomorphism on IR(A)^^^; 

a„ = -(c- s)". 
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Proof. The statement of the lemma is trivially true for n = 1,2. We assume it is 
true for fc = 1, 2, . . . , n — 1. Direct expansion reveals that 



(c - s)" = c" - ^ c''s{c - s)"-'=-^ 



fe=0 

By induction, using our assumption in this expansion and comparing with the 
partial integration formula in Lemma 3.1 proves the statement for k = n. □ 

Lemma 3.3. The projection of K ow e Q{'w) onto R(B)^ via n generates 

n 

K = Y^Ck+i{c^-''^s>'). 

k=0 

Proof. For any w G A+ with \w\ = n + 1 we have 

n+l 

Kow= ^CfcC((c""^''"^'Lus'=-i)(8)w;) 
fc=i 

n 

= ^Cfe+iC((c"-'=ms'^)®«;) 

fe=0 

Projecting this onto ]R(B)^ establishes the result. □ 

4. Mean-square sinh-log remainder is smaller 

Suppose the flow remainder associated with a flow approximation (ft is 

Rt ■■= vt - 'fit- 

The remainder associated with the approximation ijt = (fit ° Jja is thus Rt o jjq. We 
measure the error in this approximation, for each yo G M^, in mean-square by 

\\Rtoyo\\h=E{{Rtoyor{Rtoyo)). 

If we truncate tp = F(lp) to tp, including all terms Vyj with words of length \w\ ^ n, 
suppose the remainder is r, i.e. we have 

tp = tp + r. 

Then the remainder to the corresponding approximate flow ip^^ = F~^{ijj), taking 
the difference with the exact stochastic Taylor flow ip^^, is given by 

R^^ = ip''* - 

= F-i(V.)-F-i(Vi) 

= F-\ij + r) - F-\i>) 

= r - C2{ri) + 'ipr) + 0{ip'^r), 
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where we can ignore the C2 (r^/i+^r) and 0{'ip'^r) terms in this section — we comment 

on their significance in Section 5. We compare this with the stochastic Taylor flow 
remainder i?^' := (p^' — 1^^', where (^^' is the stochastic Taylor flow series truncated 
to include all terms with words of length \w\ < n. Indeed, we set 

D DSt Dsl 

/ 1 . — / 1 — / 1 . 

and use the norm to measure the remainder. Hence for any data j/o, we have 

Wov4l. = \\W'oy4l,+E, 
where the mean-square excess 

E:=E{Royoy {R'' oyo)+E (R^' o yo)" {Ro yo) +E {Ro y^Y (Ro yo). 
If E is positive then R^^ o yo is smaller than R^^ o yo in the norm. 

Theorem 4.1. Suppose we construct the finite sinh-log expansion -0 using the par- 
tial sequence of sinh-log coefficients {Ck}, and truncate ip producing tp which only 
includes terms with words w with \w\ < n. Then the flow remainders for the sinh-log 

and the corresponding stochastic Taylor approximations are such that, for any data 
yo and order n S N, the mean-square excess is given by 



E = Eq — eEi — 



where Eq > 0, E2 > and 



where, for n even, we have 



El 



El, if n even, 
0, ifn odd, 



Ei= ^iu,v){Vuoyof{Vyoyo), 

\u\ = \v\=n+l 

and 

(n-t-l n+1 \ 

^{Ju + jpou) n + + jpov) yijuA > 0. 
i=l i=l / 

Here p is the unsigned reversal mapping, i.e. if w = ai . . . Un then pow = an ■ ■ - ai. 

Proof. If we truncate the sinh-log series flow-map including all integrals associated 
with words of length n, the remainder is given by 



\w\^n+l 

where henceforth we will ignore integrals in the remainder with \w\ ^ n + 2. Recall 
that from Corollary 2.2 we have 

n+l 
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The corresponding stochastic Taylor flow-map remainder is 



|ti;|=n+l 



The difl'erence between the two is 



\w\=n+l 

where Jw = Jw — Ky, and is given by 

n+l 
i=l 

The mean-square excess to the sinh-log remainder is E which at leading order is 

^ E {JuKy + KuJv + JuJv) {Vu o 2/o)^(V; o yo). 
\u\ — \v\—n-\-l 

We need to determine whether this quantity is positive definite or not. We refer 
to Jy,Ky -\- Ky,Jy as the cross- correlation terms and JuJv as the auto- correlation 
terms. The forms for Ku and Ju imply that: 

(n+l n+l ^ 

2^Ju Jotou) Jvi ~t~ 2^^^ Jaov) 



i=l 



Consider the zero order e'' term. Using Lemma 4.3, the expectation of the cross- 
correlation term therein (the first term on the right above) is zero. Hence we have 



Eo= E(i(J„ + Jaou){Jv + Jaov)) (K ° VoV {Vy O yo) 

\u\ = \v\=n-\-l 

= E| J2 ^(»^« + '^ao„)(K°2/o) J I Yl Wv + Jaov) {Vv o yo)\ 
\|«|=n+l / \|i;|=n+l / 

>0. 

At the next order , the terms shown are solely from cross-correlations — ^with the 

auto-correlation terms cancelling with other cross-correlation terms. When n is odd 
the expectation of this term is zero, again using Lemma 4.3. When n is even we 
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get the expression for Ei stated in the theorem. Finally at order the coefficient 
shown is the auto-correlation term multiplied by minus one. Explicitly we see that 

E2= IE ( ]J J„. J„^. J {Vu o yoT (K o yo) 

\u\ = \v\=n+l 

/ n+1 \T / N 

\ \u\=n+l i=l J \ \v\=n+l j=l / 

> 0. 

Combining these results generates the form for E stated. □ 

Corollary 4.2. When n is odd, E is positive and maximized when e = 0. When n is 
even, it is positive at e = 0, but maximized at a different value of e; the maximizing 
value will depend on the vector fields. 

Lemma 4.3. For any pair u,v G A+ we have that 

Proof. Every Stratonovich integral Jw is a linear combination of ltd integrals 

where the set D(t«) consists of w and all multi-indices u obtained by successively 
replacing any two adjacent (non-zero) equal indices in w by 0, see Kloeden and 
Platen (1999), equation (5.2.34). Since all indices in w are non-zero by assumption, 
the constant c„ is given by 

where n{u) denotes the number of zeros in u. Since adjacency is retained when 
reversing an index, it follows that 

Jpow ~ ^ ^ eulpou- 

Lemma 5.7.2 in Kloeden and Platen (1999) implies that the expected value of the 
product of two multiple Ito integrals is of the form 

£(.),£(.), 5:(^.(.) + ^.(.)), n uZ)\h(v)\ 

for some function /. Here ^{u) denotes the number of non-zero indices in w, while 

fco {n) denotes the number of zero components preceding the first non-zero compo- 
nent of u, and ki(u), for i = 1, . . . ,£{u), the number of components of u between 
the i-th and {i + l)-th non-zero components, or the end of u if i = i{u). It follows 
that 

k^{u) = ki(^^)_,{pou). 
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Since all other operations in the arguments of / are unchanged by permutations in 
u and V, we deduce that 

and consequently, 

u'eo{u) 
v'eo{v) 

□ 

5. Practical implementation 

(a) Global error 

We define the strong global error associated with an approximate solution yx 
over the global interval [0,T] as £ := Wyr — VtWl'^- Suppose the exact yr and 
approximate solution fjx are constructed by successively applying the exact and 
approximate flow-maps 'Ptm,tm+i ^^'^ '^tm,*m+i on M successive intervals [tm,tm+i]-, 
with tm = mh for m = 0, 1, . . . , M — 1 and h = T/M as the fixed stepsize, to the 
initial data yo- A- straightforward calculation shows that up to higher order terms 
we have 

£2 =E(7^oyo)^(7^ot/o), 

where 

M-l 

TO=0 

and Rtrn,tm+i '■= V>t,n,tm+i-'Ptm,tm+i (^66 Lord, Malham and Wiese 2008 or Malham 
and Wiese 2008). Note that the flow remainder Rt^.t„^+i always has the form 

Rt^,t^+i = X Kw{tm)Vw, 

where for the sinh-log series = \{Jw — Jaow)^ for the exponential Lie series 
Kyj = K[yjj (for each term in the linear combination V[^]) and for the stochastic 
Taylor series Kw = Jw Substituting this into TZ we see that £^ has the form 



, l^^^l ^ m. f-A'rrt / 



\u\^n+l ^ m l^m 
\v\'^n+l 

where 

Ve,m{u,v) = ¥.{ipte+i,tM °Ku{te)VuO(fto,te°yoy{Vt,n+i,tM°Kv{trn)VvO<fito,trn °yo)- 

This formula outlines the contribution of the standard accumulation of local errors, 
over successive subintcrvals of the global interval of integration, to the global error. 
Note that to leading order we have 

Vm,m{u,v) = E{Ku{tm)Ky{tm)) {Vu O yoY (Vy O yo) . 
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For the term Ve^rn{u,v), we focus for the moment on the case m < £ (our final 
conclusions below are true irrespective of this). At leading order we have 

0toM = <^t„+i,ff O f id+ ^ Ka{tm)Va + 
^ |a| = l 

= id+ ^ Ka{tm)Va + --- , 
\a\=l 

and 

^ |h|=i 

= id+ X Kbitm)Vb + --- . 
|6| = 1 

Substituting these expressions into the form for Ve,m{u, v) above we get 

Ve,Tn{u, v) 

= E(if„(t,))E(if,(t„))(K oyo)^(K oyo) 

|a|=l 

+ J2 HKu{ti)Kb{k))E{K,{tm)){Vuoyor{VboV,oyo) 

|6| = 1 

+ X E(i^„(t^)i^b(t^))E(ii:„(f„)if,(t„))(KoKo2/or(HoKoyo)- 

|a|=l 
|6|=1 

This breakdown allows us to categorize the different mechanisms throiigh which 
local errors contribute to the global error at leading order. Indeed in the local flow 
remainder R we distinguish terms with: 

(1) zero expectation: terms Kyj with = n + 1 of zero expectation generate 
terms of order /i" in £^ through two routes, through Vm,m and the last term in 
the expression for Vi^m{u,v) just above. In Vm.m they generate order h"''^^ terms, 
and the single sum over m means that their contribution to the global error S"^ 
is order Mh"'^^ = 0{h"-). In the last term in Vermin, v), they generate, when the 
expectations of the products indicated are non-zero, terms of order the double 
sum over £ and m is then order /i". Higher order terms with zero expectation 
simply generate a higher order contribution to the global error. 

(2) non-zero expectation: terms with \w\ = n+l of non-zero expectation will 
generate, through the first term in Ve^m{u,v), terms of order — not consistent 
with an order n/2 integrator with global mean-square error of order ft,". If any such 
terms exist in R, their expectation must be included (which is a cheap additional 
computational cost) in the integrator, i.e. we should include E{Ky,)Vy, in ^. This 
will mean that the term left in R is (^K^^ — E(i^^))T4, which has zero expectation 
and contributes to the global error through mechanism (1) above. Further, terms 
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Kyj with \w\ = n + 2 of non-zero expectation will generate terms of order /i" in S"^, 
i.e. they will also contribute at leading order through the first term in V£.m(?i, v). 

The terms of non-zero expectation in R, which contribute at leading order to 
either appear as natural next order terms or through the higher order terms C2(rV'+ 
Vt) mentioned in the last section. We will see this explicitly in the Simulations 
section presently. We can, with a cheap additional computational cost, include 
them through their expectations in the integrators, so that when we compare their 
global errors, the corresponding terms left in the remainders have zero expectation 
and are higher order (and thus not involved in the comparison at leading order). 
Further, fortuitously, the terms of zero expectation which contribute in (1) through 
the last term in Vi,miu, v) are exactly the same for the stochastic Taylor and sinh- 
log integrators. This is true at all orders and is a result of the following lemma, and 
that for the stochastic Taylor and sinh-log expansions when \a\ — 1, then Ka = Ja- 

Lemma 5.1. Suppose a,w G A+ and \a\ = 1, then we have 

Proof. If \'w\ is even, the expectations on both sides are zero. If \w\ is odd, in the 
shuffle products aiuw and o l±j (a o w), pair off terms where the letter a appears in 
the same position in an individual term oi aujw and the reverse of an individual 
term of a m {aow). The pair, with the one-half factor, will have the same expectation 
as the corresponding term in shuffle product amw. □ 



(6) Simulations 

We will demonstrate the properties we proved for the sinh-log scries for numer- 
ical integration schemes of strong orders one and three-halves. We will consider 
a stochastic differential system with no drift, two driving Wiener processes and 
non-commuting governing linear vector fields ViO y = for i = 1,2. 

We focus on the strong order one case first, and extend the analytical computa- 
tions in Lord, Malham and Wiese (2008). With n = 2, and Ci = 1 and C2 = — \ + e, 
the mean-square excess E, for general e G M, given by 

E = h^{{Uu2 yoVB{e) (f/ii2 yo) + {U221 yoVB{e) {U221 yo)) ■ 

Here C/112 = (0^02,010201,020^,02)^ and C/221 = (02O1, 020102, Oio|, of)'^ are both 
4A'' X N real matrices and the 4A'' x 4A'' real matrix B{e) consists oi N x N blocks of 
the form 6(e) (8) In (here (8) denotes the Kronecker product) where In is the N x N 
identity matrix and 6(e) is 

-e(3e-li) -e(3e-|) -<3e - ii) -e(3e-i) 

-(e-i)(3e-^) -<3e-ii) A-(,- i)(3,-^) -3e(3e-A) 
V -3e(3£-^) -^(3e-i) -3e(3e-A) _5e(3e-l)y 

Each of the eigenvalues of 6(e) are N multiple eigenvalues for S(e). The eigenvalues 
of 6(e) are shown in Figure 1. The sinh-log expansion corresponds to e = 0, while 

the exponential Lie series corresponds to e = i. The eigenvalues of 6(0) arc ^ and 
zero (thrice) — confirming our general result for the sinh-log expansion. However, 
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Eigenvalues of matrix b(e) 
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Figure 1. The eigenvalues, Ai(e), i = 1, . . . , 4 of matrix fe(e), as a function of e. 

the eigenvalues of b{^) are 0.5264, 0.1667 and —0.0264. One negative eigenvalue 
means that there are matrices Oi and 02 and initial conditions yo for which the 
order one stochastic Taylor method is more accurate, in the mean-square sense, 
than the exponential Lie series method (for linear vector fields we also call this the 
Magnus method). Prom Figure 1, we deduce that for any small values of e away 
from zero, wc cannot guarantee E > for all possible governing vector fields and 
initial data. The strong order one sinh-log integrator is optimal in this sense. This 
is also true at the next order from Corollary 4.2. 

For our simulations we take N = 2 and set the coefficient matrices to be 

/0.105 -7.43\ , /-0.065 -9.44 

"^=^0.03 0.345 j ^''^ "^=1,-0.005 0.265 

In Figure 2, using these matrices, we plot the mean-square excess E^^ for the expo- 
nential Lie series and E^^ for the sinh-log series, as a function of the two components 
of yo- We see there are regions of the phase space where E^^ is negative — of course 
E^'^ is positive everywhere. Hence if the solution yt of the stochastic differential 
system governed by the vectors fields Vi o y = Uiy, i = 1,2, remains in the region 
where E^^ is negative, then the numerical scheme based on the order one exponen- 
tial Lie series is less accurate than the stochastic Taylor method. Note that for the 
stochastic Taylor method, we need to include the terms 

1l2/4i 22, 22, 4\ 
g/l [tti + 0^02 + 020^ + 02) 

in the integrator. These are the expectation of terms with \w\ = 4 which contribute 
at leading order in the global error (only), and which can be cheaply included in 
the stochastic Taylor integrator. For the exponential Lie series we include the terms 

^/i^([a2, [a2,ai]]ai + [ai, [ai,a2]]a2 + 02 [ai, [01,02]] + ai[a2, [02,01]]), 

m ■0'". These are non-zero expectation terms with \w\ =4 which contribute at 
leading order in the global error through —C2irip + tpr), where C2 = —5. In the 
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Figure 2. Contour plots of the mean-square excess as a function of the two components of 
the data yo = (iio,wo)^, for the strong order one example, for the exponential Lie series 
(left panel) and the sinh-log series (right panel). 



Figure 3 shows the global error versus time for all three integrators for the linear 
system. We used the global interval of integration [0,0.0002], starting with yo — 
(19.198,28.972)"^, and stepsize h = 2.5 x 10"^. With this initial data, the smaU 
global interval of integration means that all ten thousand paths simulated stayed 
within the region of the phase space where E^^ is negative in Figure 2. 

The error for the exponential Lie series integrator, we see in Figure 3, is larger 
than that for the stochastic Taylor integrator. The error for the sinh-log integrator 
is smaller, though only marginally so. In fact it is hardly discernible from the 
stochastic Taylor plot, so the middle panel shows a magnification of the left region 
of the plot in the top panel. We plot the differences between the errors in the 
lower panel to confirm the better performance of the sinh-log integrator over the 
global interval. Further, estimates for the local errors for the sinh-log and Lie series 
integrators from the data in Figure 3, of course, quantitatively match analytical 
estimates for the mean-square excess E above. 

Remark 5.1. Generically the Castell-Gaines method of strong order one markedly 
outperforms the sinh-log method (which itself outperforms the stochastic Taylor 
method more markedly). However, as we have seen, there are pathological cases for 
which this is not true. 

In Figure 4 we compare the global errors for the strong order three-halves sinh- 
log and stochastic Taylor methods, with governing linear vector fields with coeffi- 
cient matrices 



and initial data yo = (1, 1/2)'^. Again as expected we see that the stochastic Taylor 
method is less accurate than the sinh-log method for sufficienltly small stepsizes. 



same vein, for the sinh-log integrator, we include in ^p^^ the terms 

ll,2/o4i22i 2 I 2 i22ir)4\ 




and 
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Figure 3. Mean-square global error vs time plot for the sinh-log, exponential Lie (Magnus) 
and stochastic Taylor methods for the order one example. The top panel shows the error, 
and the middle panel a magnification of the left region of the plot in the top panel. The 
lower panel shows the diflFerences between, the global sinh-log and exponential Lie errors, 
and the error of stochastic Taylor method. 



Remark 5.2. There is one caveat we have not mentioned thusfar. Constructing the 
approximation (ft from tpt is in general nontriviaL For linear vector fields Vi o y = 
ai we know that ipt is simply a matrix, and we can straightforwardly construct 
(ft using the matrix square root. For general nonlinear vector fields, we have not 
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Figure 4. Global error vs stepsize plot for the sinh-log and stochastic Taylor methods of 
strong order three-halves example. 

as yet found a superior method to simply expanding the square root shown to a 
sufficient number of high degree terms. 

6. Concluding remarks 

We have shown that the mean-square remainder associated with the sinh-log series 
is always smaller than the corresponding stochastic Taylor mean-square remainder, 
when there is no drift, to all orders. Since the order one-half sinh-log numerical 
method is the same as the order one-half Castell-Gaines method, it trivially inherits 
the asymptotically efficient property as well (indeed, if we include a drift term 
as well). We have not endeavoured to prove asymptotic efficiency more generally. 
However, in Section 5 we demonstrated that for two driving Wiener processes, 
the order one sinh-log numerical method is optimal in the following sense. From 
Figure 1, we see that any deviation of e from zero will generate a negative eigenvalue 
for b{e). Consequently there exist vector fields such that the mean-square excess 
will be negative in regions of the phase space. Further, from Corollary 4.2, the 
order three-halves sinh-log integrator is also optimal in this sense. These results 
are only true when there is no drift, and it could be argued that our simulations 
and demonstrations are somewhat academic. However the results we proved have 
application in splitting methods for stochastic differential equations. For example, 
in stochastic volatility simulation in finance, Ninomiya & Victoir (2006) and Halley, 
Malham & Wiese (2008) simulate the Heston model for financial derivative pricing, 
and use splitting to preserve positivity for the volatility numerically. They employ 
a Strang splitting that separates the diffusion flow from the drift flow and requires 
a distinct simulation of the purely diffusion governed flow. 

Why is the sinh-log expansion the answer? This result is intimately tied to the 
mean-square error measure we chose. The terms in the remainder of any truncation 
contain multiple Stratonovich integrals. Associated with each one is a mean-square 
error. There is a structure underlying these expectations. The sinh-log expansion 
somehow encodes this structure in an optimal form, it emulates the stochastic 
Taylor information more concisely. The next question is of course, what is the best 
structure when we include drift? Answering this is our next plan of action. 
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